Decision Tree Pruning Using Expert Knowledge
Authors
Abstract
Decision tree technology has proven to be a valuable way of capturing human decision making within a computer, and it has long been a popular artificial intelligence (AI) technique. During the 1980s, it was one of the primary ways of creating an AI system. During the early 1990s, it fell somewhat out of favor, as did the AI field in general. During the later 1990s, however, with the emergence of data mining technology, the technique resurfaced as a powerful method for creating decision-making programs. How to prune a decision tree is one of the research directions within the technique, but cost-sensitive pruning has received much less attention than other pruning techniques, even though it can offer additional flexibility and increased performance. This dissertation reports on a study of cost-sensitive methods for decision tree pruning. A decision tree pruning algorithm called KBP1.0, which includes four cost-sensitive methods, is developed. Intelligent inexact classification is used for the first time in KBP1.0 to prune the decision tree, and the use of expert knowledge in decision tree pruning is discussed for the first time. The cost-sensitive pruning methods in KBP1.0 are compared on benchmark data sets with traditional pruning methods, such as reduced error pruning, pessimistic error pruning, cost complexity pruning, and C4.5, and the advantages and disadvantages of the cost-sensitive methods are summarized. This research will enhance our understanding of the theory, design, and implementation of decision tree pruning using expert knowledge. In future work, the cost-sensitive pruning methods can be integrated into other pruning methods, such as minimum error pruning and critical value pruning, and new pruning methods can be added to KBP. Using KBP to prune a decision tree and extracting rules from the pruned tree to help build an expert system is another direction for future work.
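To make the idea of cost-sensitive pruning concrete, the following is a minimal Python sketch of one possible cost-sensitive variant of reduced error pruning. It is not the KBP1.0 algorithm, whose details are not given in the abstract. The cost matrix `COST`, the class labels, and the helper names `total_cost`, `min_cost_label`, and `should_prune` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical cost matrix: COST[true_label][predicted_label].
# Assumption: missing a "sick" case is ten times worse than a false alarm.
COST: Dict[str, Dict[str, float]] = {
    "sick": {"sick": 0, "healthy": 10},
    "healthy": {"sick": 1, "healthy": 0},
}


def total_cost(counts: Dict[str, int], predicted: str,
               cost: Dict[str, Dict[str, float]]) -> float:
    """Cost of labelling every example in `counts` as `predicted`."""
    return sum(n * cost[true][predicted] for true, n in counts.items())


def min_cost_label(counts: Dict[str, int],
                   cost: Dict[str, Dict[str, float]]) -> Tuple[str, float]:
    """Return the label with the lowest total cost, and that cost."""
    best_label, best_cost = "", float("inf")
    for predicted in cost:  # assumes the matrix keys cover every label
        c = total_cost(counts, predicted, cost)
        if c < best_cost:
            best_label, best_cost = predicted, c
    return best_label, best_cost


def should_prune(leaves: List[Tuple[str, Dict[str, int]]],
                 collapsed_label: str,
                 cost: Dict[str, Dict[str, float]]) -> bool:
    """Decide whether to replace a subtree by a single leaf.

    `leaves` holds, for each leaf of the subtree, its predicted label and
    the class counts of the validation examples that reach it. The subtree
    is pruned when the single leaf costs no more on the validation data.
    """
    subtree_cost = sum(total_cost(counts, label, cost)
                       for label, counts in leaves)
    merged: Dict[str, int] = {}
    for _, counts in leaves:
        for label, n in counts.items():
            merged[label] = merged.get(label, 0) + n
    return total_cost(merged, collapsed_label, cost) <= subtree_cost
```

With these assumed costs, a node holding 2 "sick" and 8 "healthy" examples is labelled "sick" (cost 8) rather than with the majority class "healthy" (cost 20), which is the kind of decision an error-count criterion alone would not make.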
Similar resources
Use of Expert Knowledge for Decision Tree Pruning
Decision tree technology has been proven to be a valuable way of capturing human decision making within a computer. One main problem for many traditional decision tree pruning methods is that it is always assumed that all misclassifications are equally probable and equally serious. However, in a real-world classification problem, there may be a cost associated with misclassifying examples from ...
Error-Based Pruning of Decision Trees Grown on Very Large Data Sets Can Work!
It has been asserted that, using traditional pruning methods, growing decision trees with increasingly larger amounts of training data will result in larger tree sizes even when accuracy does not increase. With regard to error-based pruning, the experimental data used to illustrate this assertion have apparently been obtained using the default setting for pruning strength; in particular, using ...
Machine Learning Predicting nearly as well as the best pruning of a decision tree
Many algorithms for inferring a decision tree from data involve a two-phase process. First, a very large decision tree is grown, which typically ends up overfitting the data. To reduce overfitting, in the second phase the tree is pruned using one of a number of available methods. The final tree is then output and used for classification on test data. In this paper we suggest an alternative approach to the ...
Eighth Annual Conference on Computational Learning Theory, July
Many algorithms for inferring a decision tree from data involve a two-phase process: First, a very large decision tree is grown which typically ends up "overfitting" the data. To reduce overfitting, in the second phase, the tree is pruned using one of a number of available methods. The final tree is then output and used for classification on test data. In this paper, we suggest an alternative approa...
Evaluation of liquefaction potential based on CPT results using C4.5 decision tree
The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...